77 research outputs found

    Emotional quantification of soundscapes by learning between samples

    Predicting the emotional responses of humans to soundscapes is a relatively recent field of research coming with a wide range of promising applications. This work presents the design of two convolutional neural networks, namely ArNet and ValNet, responsible for quantifying the arousal and valence evoked by soundscapes, respectively. We build on the knowledge acquired from the application of traditional machine learning techniques on the specific domain, and design a suitable deep learning framework. Moreover, we propose the usage of artificially created mixed soundscapes, the distributions of which are located between the ones of the available samples, a process that increases the variance of the dataset leading to significantly better performance. The reported results outperform the state of the art on a soundscape dataset following Schafer’s standardized categorization considering both the sound’s identity and the respective listening context
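
The idea of creating mixed soundscapes that lie between existing samples can be illustrated with a minimal sketch. The abstract does not state the mixing rule, so the linear mix below, the interpolation of the (arousal, valence) labels with the same weight, and all names and values are assumptions made for illustration only.

```python
import math

def mix_soundscapes(a, b, ratio):
    """Linearly mix two equal-length waveforms; `ratio` is the weight of `a`."""
    if len(a) != len(b):
        raise ValueError("waveforms must have the same length")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return [ratio * x + (1.0 - ratio) * y for x, y in zip(a, b)]

def mix_labels(label_a, label_b, ratio):
    """Interpolate (arousal, valence) annotations with the same weight (assumed rule)."""
    return tuple(ratio * p + (1.0 - ratio) * q for p, q in zip(label_a, label_b))

# Two toy "soundscapes": one second of a 440 Hz and a 220 Hz tone at 8 kHz.
sr = 8000
tone_a = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
tone_b = [math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]

# Hypothetical annotations on a 0..1 scale: (arousal, valence).
mixed = mix_soundscapes(tone_a, tone_b, 0.3)
mixed_label = mix_labels((0.8, 0.2), (0.4, 0.6), 0.3)
```

Sampling the ratio at random for each pair would yield many intermediate training examples from a small annotated set, which is the effect the abstract attributes to the mixed soundscapes.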

    A Concept Drift-Aware DAG-Based Classification Scheme for Acoustic Monitoring of Farms

    Intelligent farming as part of the green revolution is advancing the world of agriculture in such a way that farms become dynamic, with the overall scope being the optimization of animal production in an eco-friendly way. In this direction, this study proposes exploiting the acoustic modality for farm monitoring. Such information could be used in a stand-alone or complementary mode to monitor the farm constantly at a great level of detail. To this end, the authors designed a scheme classifying the vocalizations produced by farm animals. More precisely, a directed acyclic graph was proposed, where each node carries out a binary classification task using hidden Markov models. The topological ordering follows a criterion derived from the Kullback-Leibler divergence. In addition, a transfer learning-based module for handling concept drifts was proposed. During the experimental phase, the authors employed a publicly available dataset including vocalizations of seven animals typically encountered in farms, where promising recognition rates were reported
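
As a rough illustration of how a Kullback-Leibler criterion could rank class pairs, the sketch below computes the symmetric KL divergence between univariate Gaussians and selects the most separable pair. The abstract does not give the actual criterion, and the paper works with hidden Markov models, so the Gaussian stand-ins, the class names, and the parameter values are all hypothetical.

```python
import math
from itertools import combinations

def kl_gaussian(m1, s1, m2, s2):
    """KL( N(m1, s1^2) || N(m2, s2^2) ) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def symmetric_kl(p, q):
    """Symmetrised divergence between two (mean, std) models."""
    return kl_gaussian(*p, *q) + kl_gaussian(*q, *p)

def most_separable_pair(models):
    """Return the pair of class names with the largest symmetric KL divergence."""
    return max(
        combinations(sorted(models), 2),
        key=lambda pair: symmetric_kl(models[pair[0]], models[pair[1]]),
    )

# Hypothetical per-class models (mean, std) of a single acoustic feature.
class_models = {"cow": (0.0, 1.0), "sheep": (0.5, 1.0), "hen": (4.0, 1.0)}
best_pair = most_separable_pair(class_models)
```

Under this assumed criterion, the classes that are easiest to tell apart would be separated by nodes early in the graph, leaving the harder distinctions to later nodes.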

    Automatic acoustic classification of insect species based on directed acyclic graphs

    This work presents the design of a directed acyclic graph (DAG) scheme, the nodes of which incorporate hidden Markov models (HMMs) for classifying insect species. Such a DAG scheme is able to limit the problem space, while having the HMMs capture the temporal evolution of Mel-scaled spectrograms extracted out of wingbeat sounds. Interestingly, the proposed approach offers interpretability of the classification process by inspecting the sequence of edges activated in the DAG (path). The dataset encompasses 50 000 wingbeat sounds representing six species, i.e., Ae. aegypti (male and female), Cx. quinquefasciatus (male and female), Cx. stigmatosoma (male and female), Cx. tarsalis (male and female), Musca domestica, and Drosophila simulans, and is publicly available at https://sites.google.com/site/insectclassification/. Thorough species classification experiments showed that the proposed solution outperforms state-of-the-art approaches

    A transfer learning framework for predicting the emotional content of generalized sound events

    Predicting the emotions evoked by generalized sound events is a relatively recent research domain which still needs attention. In this work, a framework is presented that aims to reveal potential similarities in the perception of emotions evoked by sound events and songs. To this end, the following are proposed: (a) the usage of temporal modulation features, (b) a transfer learning module based on an echo state network, and (c) a k-medoids clustering algorithm predicting valence and arousal measurements associated with generalized sound events. The effectiveness of the proposed solution is demonstrated after a thoroughly designed experimental phase employing both sound and music data. The results demonstrate the importance of transfer learning in the specific field and encourage further research on approaches which manage the problem in a synergistic way

    Automatic detection of cow/calf vocalizations in free-stall barn

    Precision livestock farming dictates the use of advanced technologies to understand, analyze, assess and finally optimize a farm’s production collectively as well as the contribution of each single animal. This work is part of a research project wishing to steer the dairy farms’ producers to more ethical rearing systems. To study cow’s welfare, we focus on reciprocal vocalizations including mother-offspring contact calls. We show the set-up of a suitable audio capturing system composed of automated recording units and propose an algorithm to automatically detect cow vocalizations in an indoor farm setting. More specifically, the algorithm has a two-level structure: a) first, the Hilbert follower is applied to segment the raw audio signals, and b) second the detected blocks of acoustic activity are refined via a classification scheme based on hidden Markov models. After thorough evaluation, we demonstrate excellent detection results in terms of false positives, false negatives and confusion matrix
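
The first level of the two-level structure can be sketched as follows, assuming that "Hilbert follower" refers to the amplitude envelope obtained from the analytic signal and that segmentation is done by thresholding that envelope. The threshold, the toy signal, and the function names are hypothetical, a naive DFT is used only to keep the example dependency-free, and the HMM-based refinement stage is not shown.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short toy signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def hilbert_envelope(x):
    """Magnitude of the analytic signal: keep DC, double positive bins, zero negative bins."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    spectrum = dft(x)
    analytic = dft([s * g for s, g in zip(spectrum, gain)], inverse=True)
    return [abs(z) for z in analytic]

def active_segments(envelope, threshold):
    """Return (start, end) index pairs where the envelope stays at or above the threshold."""
    segments, start = [], None
    for i, value in enumerate(envelope):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(envelope)))
    return segments

# Toy recording: silence, a short tonal burst, silence (128 samples in total).
burst = [math.cos(2 * math.pi * 8 * n / 64) for n in range(64)]
signal = [0.0] * 32 + burst + [0.0] * 32
segments = active_segments(hilbert_envelope(signal), threshold=0.5)
```

Each returned segment would then be passed to the second-level classifier, which decides whether the block of acoustic activity is a vocalization.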

    Preservation and Promotion of Opera Cultural Heritage: The Experience of La Scala Theatre

    This paper focuses on music and music-related cultural heritage typically preserved by opera houses, starting from the experience achieved during the long-lasting collaboration between La Scala theater and the Laboratory of Music Informatics of the University of Milan. First, we will mention the most significant results achieved by the project in the fields of preservation, information retrieval and dissemination of cultural heritage through computer-based approaches. Moreover, we will discuss the possibilities offered by new technologies applied to the conservative context of an opera house, including: the multi-layer representation of music information to foster the accessibility of musical content also by non-experts; the adoption of 5G networks to deliver spherical videos of live events, thus opening new scenarios for cultural heritage enjoyment and dissemination; deep learning approaches both to improve internal processes (e.g., back-office applications for music information retrieval) and to offer advanced services to users (e.g., highly-customized experiences)

    Gaussian mixture modeling for detecting integrity attacks in smart grids

    Research on embedding intelligence in cyber-physical critical infrastructures (CI) has received considerable attention in recent years. This paper presents a methodology able to differentiate between the normal state of a system composed of interdependent infrastructures and states that appear to be normal although the system (or parts of it) has been compromised. The system under attack seems to operate properly since the associated measurements are simply a variation of the normal ones created by the attacker, and intended to mislead the operator while the consequences may be of catastrophic nature. Here, we propose a holistic modeling scheme based on Gaussian mixture models estimating the probability density function of the parameters coming from linear time-invariant (LTI) models. The LTI models approximate the relationships between the datastreams coming from the CI. The experimental platform includes a power grid simulator of the IEEE 30 bus model controlled by a cyber network platform. Subsequently, we implemented a wide range of integrity attacks (replay, ramp, pulse, scaling, and random) with different intensity levels. An extensive experimental campaign was designed and we report satisfying detection results
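
A minimal sketch of the detection principle: a Gaussian mixture model describes the density of a model parameter under normal operation, and a value with low log-likelihood is flagged. The abstract concerns parameters of LTI models, which are presumably multivariate; the scalar parameter, the mixture components, and the threshold below are all hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def gmm_log_likelihood(x, components):
    """Log-density of x under a mixture given as (weight, mean, std) triples."""
    density = sum(w * gaussian_pdf(x, m, s) for w, m, s in components)
    return math.log(density) if density > 0.0 else float("-inf")

# Hypothetical mixture fitted to one LTI parameter during normal operation.
normal_model = [(0.6, 1.0, 0.1), (0.4, 1.5, 0.2)]
threshold = -5.0  # assumed decision threshold on the log-likelihood

def is_suspicious(x, components=normal_model, limit=threshold):
    """Flag a parameter value that is unlikely under the normal-state model."""
    return gmm_log_likelihood(x, components) < limit
```

In practice the threshold would be tuned on held-out normal data, trading false alarms against missed attacks.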

    Transfer Learning for Improved Audio-Based Human Activity Recognition

    Human activities are accompanied by characteristic sound events, the processing of which might provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes which are statistically close to the ones associated with limited data; (b) learns a multiple input, multiple output transformation; and (c) transforms the data of the closest classes so that it can be used for modeling the ones associated with limited data. Furthermore, the proposed framework includes a feature set extracted out of signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes